Description
This track involves using ONLY LLMs to perform the CEA task on Wikidata. Participants are tasked with fine-tuning or simply prompting an LLM using a dataset containing semantic annotations. The task entails several challenges: injecting factual knowledge from a knowledge graph (KG) into an LLM, devising strategies to handle Wikidata QIDs, enriching the training dataset to improve the LLM's disambiguation capabilities, limiting hallucinations, and constructing effective prompts for fine-tuning and annotation. The objective is to harness the extensive capabilities of LLMs to generate accurate annotations for the CEA task, thereby advancing their utility in semantic enrichment endeavours. Participants must also submit their annotations for evaluation on the test set, demonstrating the effectiveness of their approaches in realistic scenarios.
Matching Tasks:
- CEA Task: Matching a cell to a Wikidata entity
Task overview
Participants are provided with tabular data containing columns with entity mentions. These entity mentions must be annotated with corresponding entities from Wikidata. Each annotation should contain the URI of the Wikidata entity (the prefix http://www.wikidata.org/entity/ is not mandatory). The prompt used to perform the CEA must also be submitted.
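As a starting point, a cell annotation can be obtained by prompting an LLM with the cell mention plus its table context. The sketch below is purely illustrative: the prompt wording, the context fields, and the `NIL` fallback are assumptions, not the official track prompt.

```python
# Hypothetical prompt builder for the CEA task. The exact wording and
# the choice of context (column header, other row cells) are assumptions.
def build_cea_prompt(cell_value: str, column_header: str, row_context: list[str]) -> str:
    """Compose a prompt asking an LLM to link a cell mention to a Wikidata QID."""
    context = "; ".join(row_context)
    return (
        "You are an entity-linking assistant for the CEA task.\n"
        f"Table column: {column_header}\n"
        f"Row context: {context}\n"
        f'Cell mention: "{cell_value}"\n'
        "Answer with the single Wikidata QID (e.g. Q42) that best matches "
        "the cell mention, or NIL if no entity applies."
    )

prompt = build_cea_prompt("Barack Obama", "President", ["USA", "2009"])
```

The returned string can be sent to any chat-completion endpoint; the model's answer (a bare QID) can then be expanded to the full entity URI if desired, since the prefix is optional.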
Evaluation Criteria
Precision, Recall and F1 Score are calculated: \[Precision = \frac{\#\,correct\_annotations}{\#\,submitted\_annotations}\] \[Recall = \frac{\#\,correct\_annotations}{\#\,ground\_truth\_annotations}\] \[F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}\]
Notes:
- # denotes the number.
- \(F1\) is used as the primary score, and \(Precision\) is used as the secondary score.
- Each target cell has exactly one ground truth annotation, i.e., # ground truth annotations = # target cells. Each ground truth annotation already covers all equivalent entities (e.g., entities whose wiki pages redirect to it); a ground truth annotation is hit if any of its equivalent entities is hit.
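The formulas above can be computed directly. The sketch below assumes submissions are a dict keyed by `(filename, column_id, row_id)` mapping to a predicted QID, and that each ground-truth value is a set of equivalent QIDs (covering redirects), as described in the notes; the data layout is an assumption, not the official scorer.

```python
# Minimal CEA scorer, assuming:
#   submitted:    {(filename, col_id, row_id): "Qxxx"}
#   ground_truth: {(filename, col_id, row_id): {"Qxxx", ...}}  # equivalent QIDs
def score_cea(submitted: dict, ground_truth: dict) -> tuple[float, float, float]:
    # An annotation is correct if it hits any equivalent entity of the ground truth.
    correct = sum(
        1
        for key, qid in submitted.items()
        if key in ground_truth and qid in ground_truth[key]
    )
    precision = correct / len(submitted) if submitted else 0.0
    recall = correct / len(ground_truth) if ground_truth else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1
```

Because # ground truth annotations = # target cells, annotating every target cell makes precision and recall coincide; skipping uncertain cells trades recall for precision.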
Round 1
Datasets
Target Knowledge Graph: Wikidata. For offline use, take the March 20, 2024 dump.
Datasets' Structure
The dataset is divided into training and validation sets. It includes general-purpose tables and intentionally misspelt entities to evaluate the model's robustness. Participants must annotate the entity mentions in the validation set and submit their annotations following the provided target file.
Supported Task
- CEA
Targets Format
- filename, column id, row id
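Given the targets format above, a submission can be produced by walking the targets file and appending the predicted QID to each row. This is a sketch under assumptions: the CSV layout (no header, QID as a fourth column) and the `NIL` placeholder for missing predictions are illustrative, not an official specification.

```python
import csv

# Hypothetical submission writer: `predictions` maps
# (filename, column id, row id) -> QID string.
def write_submission(targets_path: str, predictions: dict, out_path: str) -> None:
    """Emit one row per target cell, in target-file order, with the QID appended."""
    with open(targets_path, newline="") as fin, open(out_path, "w", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        for filename, col_id, row_id in reader:
            qid = predictions.get((filename, col_id, row_id), "NIL")
            writer.writerow([filename, col_id, row_id, qid])
```

Preserving the target file's order and keys avoids mismatches during evaluation, since # ground truth annotations = # target cells.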
Participate!
Submission: Are you ready? Then submit your results on the test set and your model via the Submission URL.
Round 2
Datasets
Target Knowledge Graph: Wikidata. For offline use, take the March 20, 2024 dump.
Datasets' Structure
The dataset is divided into training and validation sets. It includes general-purpose tables and intentionally misspelt entities to evaluate the model's robustness. Participants must annotate the entity mentions in the validation set and submit their annotations following the provided target file.
Supported Task
- CEA
Targets Format
- filename, column id, row id
Participate!
Submission: Are you ready? Then submit your results on the test set and your model via the Submission URL.